The GMAT Exam Is Not Getting Easier: The Fallacy of Score Increases and the Impact of Score Preview


Executive Summary 


The Graduate Management Admission Test® (GMAT®) exam
was developed to assess skills most relevant to student
success in graduate business programs, and to help business
schools select qualified applicants. First used by 54 business
schools as a criterion for admission, the test is now used by
more than 6,500 programs worldwide and taken each year by
more than 200,000 candidates exclusively for application to
graduate business programs.


In recent years, the average GMAT exam score of the
admitted class at leading graduate business programs has
risen, raising questions about the integrity of GMAT exam
scores and whether the GMAT exam is getting easier. This
report explores whether and how GMAT exam scores have
changed over time.


GMAT exam scores for citizens of eight countries¹ (those
that have consistently accounted for about 80 percent of
exams taken each year since testing year (TY) 2011²) were
analyzed, and the analysis shows that average GMAT
exam scores have remained stable. The exam is not getting
any easier. What has changed, however, is the profile of test
takers. Changing demographics are related to the broadening
of the applicant pool for graduate business education, and
hence what defines the “average” examinee. Shifts in
underlying candidate demographics have had a small, but
predictable, impact on calculated average scores.


Given that the GMAT exam is not getting any easier, what factors
might contribute to rising average scores among admitted
classes? Analysis indicates that the recently introduced score
preview feature, which enables GMAT test takers to select
which of their test scores a school can see, is having an
impact. Lower test scores are removed from the pool through
student cancellation and a greater number of higher scores
are reported to schools, a phenomenon common across all
program groups analyzed in this study.


Although higher average GMAT exam scores among the
admitted candidate pool are associated with an increase
in candidate quality, there can be an adverse impact. As
higher-scoring candidates fill the applicant pool, programs
can choose from among a larger number of these candidates,
resulting in a rise in reported program averages.
Candidates not meeting the program’s typical score profile
may therefore decide to target different programs, retake
the exam, or opt out of the application process altogether.


Average GMAT exam scores cannot rise indefinitely and hence 
the cycle will break. The challenge for graduate business 
schools and the industry is how to effectively communicate 
the role that standardized assessments like the GMAT exam
play in admissions, and to give candidates greater insight into 
the full range of scores among the applicant pool and what 
that means for them. 


¹ The eight countries are the United States, Canada, United Kingdom, France, Germany, China, India and South Korea.


² The GMAT exam volumes are reported in testing years that mirror the academic calendar. For example, testing year 2011 ran from July 1, 2010 to June 30, 2011.


Demographic Analysis of GMAT Testing Population


As the number and types of graduate business programs 
have expanded and diversified (Figure 1), so too have 
the types of students applying to them. The candidate 
pool for graduate business education has become more 
heterogeneous and more globalized. 


Increased candidate diversity is good for graduate business
education. One less-considered effect of greater diversity,
however, is how it affects reported average GMAT exam
scores across populations, given that different demographic
groups perform differently. For example, women typically
have a lower average total score than men (542 vs. 560
in TY2016). If more women take the GMAT exam, averages
across the test-taker pool will decrease and scores will
appear to decline. Younger candidates, on the other hand,
tend to score higher, so if a greater proportion
of younger people take the exam, average GMAT scores
would appear to increase. Citizenship, undergraduate major,
country of residency, and many other factors also affect
average scores.
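The mix effect described above can be illustrated with a short sketch. The two group averages (542 and 560) are the TY2016 figures quoted in the text; the shares of women in the pool (40 and 45 percent) and the function name are hypothetical values chosen only for illustration.

```python
def pooled_average(share_women, avg_women=542.0, avg_men=560.0):
    """Average Total score of a pool with the given share of women.

    The defaults are the TY2016 group averages quoted in the report;
    group performance itself is held fixed.
    """
    return share_women * avg_women + (1.0 - share_women) * avg_men


# Hypothetical shares: only the composition of the pool changes.
before = pooled_average(0.40)
after = pooled_average(0.45)

# The pooled average falls (roughly 552.8 -> 551.9) even though
# neither group's own average has moved.
print(round(before, 1), round(after, 1))
```

The point of the sketch is that a change in who takes the exam moves the headline average without any change in how each group performs.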


To empirically evaluate whether and how GMAT exam scores
have changed, the demography of the testing population
was analyzed using GMAT exam scores for citizens of the
eight countries that have consistently accounted for about
80 percent of exams taken each year since TY2011. These
include the United States, Canada, France, Germany,
United Kingdom, China, India, and South Korea. A new
TY2016 score average was then calculated using the TY2016
demographic distribution but with historical average
scores. If the projected and actual GMAT score averages
are approximately equal, the test has not gotten any easier.
Larger average score differences would indicate that an
underlying score shift has occurred. (See Appendix A for
further details on the methodology.)


Figure 1: MBA, executive MBA, and master’s programs receiving one or more GMAT exam scores 


Findings


Test-Taker Demographics 


Between TY2012 and TY2016, among citizens of the eight
countries being studied, there has been growth in both the
number of female examinees (+2.8 percent) and younger test
takers (+2.5 percent). Over this period, the share of exams
taken by women aged 24 or younger and by women aged
25 to 30 rose by 2.6 percent and 0.7 percent, respectively.
Exams taken by men in the same age categories changed by
0.2 percent and -1.3 percent, respectively (Figure 2).


Figure 2: Changes in examinee demographics, TY2012 to TY2016 


Among demographic segments representing more than
250 exams in TY2016, those with the greatest growth rates
in test taking are found in China and India among pre- or
early-career candidates with a business undergraduate
degree (Figure 3). In contrast, the greatest declines in test
taking are typically among U.S. examinees over the
age of 40 (Figure 4).


Figure 3: Demographic GMAT segments with greatest percentage growth in test taking between TY2012 and TY2016


Figure 4: Demographic GMAT segments with greatest percentage decline in test taking between TY2012 and TY2016


Projecting TY2016 Scores Using Historical Data


Clearly, there have been changes in candidate
demographics in recent years. If, however, these are held
constant, do GMAT exam scores remain the same?³


Five years of GMAT data from TY2011 to TY2015 were used to
recalculate TY2016 score averages (see Appendix B, Figure
B.2), and across all testing years only minor variances
were noted: The average difference between actual and
projected GMAT scores for the eight countries examined
was 2.43. Analysis reveals that U.S. examinees appear to
be performing at a slightly higher level than expected as,
across all years, the average difference between projected
and actual is -9.5.
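As a rough check on the U.S. figure, the projected U.S. Total scores can be compared with the TY2016 actual. The values below are read from Figure B.2 as printed (projections from TY2011 to TY2015 data of 534, 537, 535, 539, and 543 against an actual of 547); the variable names are illustrative, and because the printed values are rounded the result can differ slightly from the -9.5 quoted above.

```python
# U.S. Total scores as printed in Figure B.2 (minimum segment size of 10).
us_actual_ty2016 = 547
us_projected = {
    "TY2011": 534,
    "TY2012": 537,
    "TY2013": 535,
    "TY2014": 539,
    "TY2015": 543,
}

# Difference is projected minus actual, matching the sign used in the text.
differences = [proj - us_actual_ty2016 for proj in us_projected.values()]
mean_difference = sum(differences) / len(differences)

print(differences)       # one difference per historical year
print(mean_difference)   # close to the reported -9.5
```

Every projection sits below the actual score, which is what "performing at a slightly higher level than expected" means here.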


In Figure 5, actual TY2016 average scores by citizenship
(x-axis) are mapped against the projected average scores 
calculated using historic data (y-axis). The diagonal line 
represents equivalence (i.e., when actual and projected scores 
are the same) and the distance from this line illustrates the 
difference between actual and projected performance. 


When using TY2012 data to project a TY2016 average,
Chinese and Indian candidates would be expected to score
higher than they did: In TY2016, Chinese candidates scored,
on average, 581, yet using TY2012 data resulted in a projected
average score of 589. For Indian candidates, the two values
were 577 and 582, respectively. In contrast, French citizens
performed slightly better than expected: Average TY2016
scores for the population analyzed were 566, yet projections
based on TY2012 data estimated an average score of 556
(Figure B.2, Appendix B).


Though a difference of 10 points may seem large, it is
important to note that the GMAT exam is scored in 10-point
increments. Ten points are therefore one increment and
equivalent to a 1.7 percent change in score when considered
across the score range. Within this context, it is a relatively
small difference, possibly attributable to factors such as greater
familiarity with the exam format, increased preparedness, and
the impact of changing examinee demographics.
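The 1.7 percent figure follows from dividing one increment by the width of the score range. The sketch below assumes the Total score scale runs from 200 to 800; that range is not stated in this report and is supplied here as an assumption.

```python
# Assumed Total score scale (not stated in the report itself).
SCORE_MIN = 200
SCORE_MAX = 800
INCREMENT = 10  # scoring increment quoted in the text

# One increment expressed as a share of the full score range.
share_of_range = INCREMENT / (SCORE_MAX - SCORE_MIN)

print(f"{share_of_range:.1%}")  # 1.7%
```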


GMAT score performance has therefore not changed; what
has changed is the profile of GMAT examinees.
Changes in the underlying candidate demographics have
had a small and predictable impact on calculated
average scores.


³ When using historical data to recalculate the average GMAT score in TY2016, defining a minimum segment size helps to mitigate the effect of small numbers of examinees.
For the projected TY2016 average scores discussed in the following sections, the minimum segment size was set at 10.


Figure 5: Mapping actual (TY2016) against projected GMAT scores (historical data)


Score Preview


If the GMAT exam is not getting any easier, then what
factors might contribute to rising average scores at top-tier
business schools?


As demonstrated in the previous analysis, the changing
demographic profile of GMAT examinees is having an impact
on average GMAT scores. Another factor that might
affect average GMAT scores is the new (2014) score preview
feature: Candidates can preview their score and decide
whether to keep or cancel it. Are top-tier programs simply
seeing more high scores (and therefore able to cherry-pick
the best), or are they seeing fewer lower scores as
candidates take themselves out of the application pipeline
by deciding not to send a score to a school? Comparing
two years of GMAT score-sending data, TY2014 (prior to
the introduction of score preview) and TY2016 (when score
preview was available), provides some insight.


The Dataset 


To simplify analysis, three groups of programs were created: 


- Group A: global programs, ranked highly around the world


- Group B: leading regional programs, may be internationally ranked


- Group C: leading domestic programs


Each group included 10 full-time MBA programs (or local
equivalent) offered by a mix of schools from around the
world (five U.S., one Canadian, two European, and two Asian).


The test-taking and score-sending behavior of candidates
who sent GMAT scores to one or more of these programs
during TY2014 and TY2016, and who were citizens of one of
the eight countries being evaluated (United States, Canada,
France, Germany, United Kingdom, China, India, South
Korea), was analyzed.


Figure 6 shows the overlap in the percentage of scores sent
to programs in Groups A, B, and C. The relatively low level
of overlap indicates that, at least among the forced grouping
applied, there are distinct groups of candidates.


TY2014 and TY2016 Summary Data 


In TY2014, the dataset consisted of 52,476 exams, taken by
39,483 unique candidates who were citizens of the United
States, Canada, France, Germany, United Kingdom, China,
India, and South Korea. Their average GMAT score was 613.
Among those sending scores to Group A, the average score
was higher, at 629 (vs. 598 for Group B and 558 for Group C).


In TY2016, 34,101 unique candidates from the same countries
who had sent one or more scores to the three program
groups sat for a total of 50,153 exams. The average
score across all exams was 628: 636 for reported scores and
593 for canceled ones (Figure 7). Among candidates sending
scores to Group A, the average was 651 (651 for reported;
603 for canceled) and approximately 1 in 5 scores (19%)
were canceled.


Figure 6: Overlap in GMAT score sending (TY2014 and TY2016), Groups A, B, and C


Figure 7: Summary TY2014 and TY2016 GMAT exam and score data
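The three TY2016 averages quoted for the full dataset (628 across all exams, 636 for reported scores, 593 for canceled scores) imply a canceled share, because the overall average is a weighted mix of the other two. The sketch below backs that share out; it assumes every exam is either reported or canceled and treats the rounded averages as exact, so the result is approximate. The function name is illustrative.

```python
def implied_canceled_share(overall_avg, reported_avg, canceled_avg):
    """Share of exams canceled, given the three averages.

    Solves overall = (1 - c) * reported + c * canceled for c.
    """
    if reported_avg == canceled_avg:
        raise ValueError("averages must differ to identify a share")
    return (reported_avg - overall_avg) / (reported_avg - canceled_avg)


# TY2016 averages for the full dataset, as quoted in the text.
share = implied_canceled_share(overall_avg=628, reported_avg=636, canceled_avg=593)

print(f"{share:.1%}")  # a little under one in five exams
```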


Findings


Score Preview Impact on Group A Programs


In TY2014, Group A programs received 69.8 percent of exam
scores from the citizens of the eight countries being examined;
this figure dropped to 60.4 percent in TY2016. Fewer test takers
therefore opted to send scores to these programs.


Figure 8 illustrates the distribution of scores sent to one or
more programs in Group A. In TY2014, a larger share of
candidates scoring at or below 650 sent their scores to
Group A programs when compared with TY2016. This
led to an increase in the share of scores coming
from higher-scoring candidates: In TY2016, candidates
scoring 660 or higher accounted for 57 percent of scores
received by Group A schools (vs. 45 percent in TY2014).
If we assume previewed scores were sent (rather than
canceled), this share would drop slightly to 52.5 percent,
still a higher proportion than in TY2014.


In terms of the volume, rather than share, of scores, the
story is the same: Globally ranked programs are attracting
more scores from test takers scoring 690 or higher (13,262
in TY2016 versus 12,022 in TY2014) despite there being fewer
exams with this score in the dataset.


Figure 8: Distribution of GMAT scores, by band, sent to global ranked programs by citizens of 8 countries


Among the 26,269 unique test takers who sent scores to one
or more Group A programs in TY2016, 1 in 6 had previewed
and canceled scores during the testing year. Using the test
taker’s actual score-sending behavior during the same testing
year to predict where the canceled scores might have been
sent illustrates the impact of score preview on applicant pool
statistics (Figure 9).


It is apparent that the score preview feature available to
GMAT test takers resulted in the removal of some test scores
from the TY2016 pool (average score 606) and substitution
of higher scores (average score 654). The distribution of
scores from these retaken exams is similar to the scores
of examinees who did not preview and cancel their score
(average 651).


The analysis indicates that roughly 1 in 6 candidates in the
dataset who targeted one or more of the selected Group
A programs chose to cancel a score after previewing it.
Those scoring 600 to 610 were most likely to preview and
cancel, and gained an average of 48 points on
subsequent exams. The consequence of score preview is
that, as the number of higher-scoring candidates sending
scores to Group A programs increases, these programs
have more high-scoring candidates to choose from and so
reported program averages rise, which, in turn, may drive
lower-scoring candidates away (or encourage them to retake).
Unless schools intervene, the cycle will continue until it
breaks. The challenge for these programs is determining how
to address candidate self-selection and opt-out.


It must be noted that “lower scoring” in this context
is relative. Based on percentile rankings, a candidate
achieving 650 on the GMAT exam has still scored higher
than 76 percent of examinees.


Figure 9: Distribution of GMAT scores intended for global ranked programs, by score band and 


Group B and C Programs


A similar shift can be seen among candidates who 
target Group B (leading regional) and Group C (leading 
domestic) programs. 


In TY2014, Group B programs received 19.5 percent
of reportable exam scores from the citizens of the eight
countries being examined; this figure dropped to 16.4
percent in TY2016, when score preview was available.
For Group C programs the numbers are 8.9 percent and 6.7
percent, respectively.


In contrast with TY2014, TY2016 examinees who scored 630
and above were more likely to send their scores to leading
regional (Group B) programs (53.2 percent in TY2016; 44.6
percent in TY2014). This may be the result of candidates
who previously set their sights on globally ranked (Group
A) programs revising their outlook, scoring higher than
expected, or hedging their bets. Among score senders to
leading domestic (Group C) programs, the shift in behavior
typically occurs at a lower score, as those scoring 600 or
above became more likely to send a score (39.6 percent in
TY2014; 47.1 percent in TY2016) (Figure 10).


Figure 10: Distribution of GMAT scores intended for Group B and Group C programs, by citizens of 8 countries


Score preview and cancellation are not unique to Group A
program aspirants: 17.5 percent of Group B candidates
and 14.6 percent of Group C candidates canceled one or more
scores in TY2016. As before, a candidate’s actual
score-sending behavior in the same testing year has been used to
predict where the canceled scores might have been sent so
that the impact on pool statistics can be evaluated (Figure 11).


Among Group B and C programs, score preview is having 
a small impact on the applicant pool quality, as retaking 
candidates score slightly higher, on average, than those 
who never canceled a score (for Group A programs it was 
about the same): 


- Group B: If a candidate has previewed and canceled one
or more scores, their reportable score average is 631
compared with 612 among those with no cancellations.
Canceled scores average 579.


- Group C: If a candidate has previewed and canceled one
or more scores, their reportable score average is 593
compared with 571 among those with no cancellations.
Canceled scores average 539.


Figure 11: Distribution of GMAT scores intended for Group B and C programs, by score band and exam status


Conclusion


As the average GMAT exam scores of the admitted classes of
leading graduate business programs around the world rise,
schools have understandably questioned whether the exam
has gotten easier. The analysis presented in this report
explores whether and how GMAT exam scores have changed
over time. The method employed involved exploring the
underlying demographics of the testing population and
using them to calculate a new TY2016 average using previous
years’ performance data. The minimal variation between
projected and actual average scores for the citizens of eight
countries in the study sample clearly shows that the GMAT
exam maintains a consistent set of cognitive standards.


What has changed, however, is the profile of test takers.
Changing demographics are related to the broadening
of graduate business education, and hence what defines
the “average” examinee. Shifts in underlying candidate
demographics have had a small, but predictable, impact on
calculated average scores.


If the GMAT exam is not getting any easier, what other factors
might contribute to rising average scores at top-tier schools?
Are these programs simply seeing more high scores (and
therefore able to cherry-pick the best), or are they
seeing fewer lower scores as candidates either target
different programs or opt not to send a score, therefore
leaving the application pipeline? Two years of
score-sending data, TY2014 (before the GMAT score preview
was introduced) and TY2016 (when score preview was
available), were compared.


Graduate Management Admission Council (GMAC)


Analysis indicates that the score preview feature of the GMAT
exam, which enables candidates to select the exam scores
they want a school to see and cancel scores they don’t want
to share with schools, is contributing to perceptions of score
increases. Lower scores are removed from the pool through
cancellation and an increased number of higher scores are
reported to schools, a phenomenon common across all three
groups of programs analyzed.


On the one hand, higher average GMAT exam scores among the
admitted candidate pool are associated with an increase
in quality; on the other, there can be an adverse impact on
candidates. As the number of higher-scoring candidates
fills the applicant pool, programs have a larger number of
higher-scoring candidates to choose from and reported
program averages rise. Candidates not meeting the
typical GMAT score profile may therefore decide to target
different programs, retake the GMAT exam, or opt out of the
application process altogether. Further analysis is required
to determine whether and how specific demographic
groups behave differently.


Average scores cannot rise indefinitely and, at some point, 
the cycle will break. The challenge for graduate business 
schools and the industry is how to effectively communicate 
the role that standardized assessments like the GMAT exam
play in admissions, and give candidates greater insight into 
the full range of scores among the applicant pool and what 
that means for them. 


Appendix A: Methodology for Demographic Analysis


To evaluate whether and how GMAT exam scores have
changed, a demographically driven comparison was carried
out using annual GMAT testing data from TY2011 to TY2015 to
recalculate the TY2016 average exam score.


First, the examinee base was segmented along five 
dimensions (citizenship, residency status in the country of 
citizenship, gender, undergraduate study major, and age), 
using five-year testing data from citizens of eight countries 
that account for about 80 percent of global testing volumes. 
These include the United States, Canada, France, Germany, 
United Kingdom, China, India, and South Korea. 


Once these segments were created, GMAT performance
data (average scores for the Quantitative and Verbal exam
sections and Total score) were calculated for each segment
and testing year. Using the distribution of candidates in
TY2016 across the different demographic subgroups, a new
average score was then calculated using each year’s data
(TY2011 to TY2015).


It is possible that segments present in one year are not
present in the comparison year. Such segments are
excluded from the recalculation, as there must be one
or more candidates in each group across both years.
The overlap measure indicates how much of the TY2016
examinee base is represented in earlier years.
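The recalculation described in this appendix can be sketched as a reweighting: each segment keeps its TY2016 headcount but takes its average score from a historical year. The code below is a minimal illustration, not the procedure actually used for the report. The function names, the record layout, and the toy data are all hypothetical, and applying the minimum segment size to both years is an assumption.

```python
from collections import defaultdict


def segment_stats(records):
    """Return {segment: (count, mean_score)} for (segment, score) pairs."""
    totals = defaultdict(lambda: [0, 0.0])
    for segment, score in records:
        totals[segment][0] += 1
        totals[segment][1] += score
    return {seg: (n, total / n) for seg, (n, total) in totals.items()}


def project_average(target_records, historical_records, min_segment_size=1):
    """Reweight historical segment means by the target year's segment counts.

    Returns (projected_average, overlap), where overlap is the share of
    target-year examinees that fall in segments usable in both years.
    """
    target = segment_stats(target_records)
    history = segment_stats(historical_records)
    # Assumption: a segment needs at least min_segment_size examinees
    # in BOTH years to be used.
    shared = [
        seg for seg, (n, _) in target.items()
        if n >= min_segment_size
        and seg in history
        and history[seg][0] >= min_segment_size
    ]
    covered = sum(target[seg][0] for seg in shared)
    if covered == 0:
        raise ValueError("no segment is present in both years")
    projected = sum(target[seg][0] * history[seg][1] for seg in shared) / covered
    overlap = covered / sum(n for n, _ in target.values())
    return projected, overlap


# Toy data: a segment is (citizenship, gender, age band); real segments
# would also include residency status and undergraduate major.
ty2016 = [
    (("US", "F", "<=24"), 560), (("US", "F", "<=24"), 540),
    (("US", "M", "25-30"), 570), (("IN", "M", "<=24"), 600),
]
ty2012 = [
    (("US", "F", "<=24"), 530), (("US", "F", "<=24"), 550),
    (("US", "M", "25-30"), 580),
]

projected, overlap = project_average(ty2016, ty2012)
print(round(projected, 1), overlap)
```

In the toy data the Indian segment has no TY2012 counterpart, so it is dropped and the overlap falls to 75 percent, mirroring how the overlap column in Appendix B shrinks when the minimum segment size is raised.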


Appendix B: Findings From Using TY2011 and TY2015 Data to Project a ‘New’ TY2016 Average Score


When using historical data to recalculate the average
GMAT score in TY2016, it is important to define how many
examinees must be present in each segment for it to be
used in the calculation. Setting a baseline segment size
greater than one mitigates the effect of small numbers
of examinees. The tables present findings
when minimum segment sizes of 1 and 10 are used.


Figure B.1: Projections for recalculated average TY2016 GMAT scores⁴ resulting from a minimum segment size of one


            TY2016 Actual      TY2016 Using TY2011 Data    TY2016 Using TY2012 Data
Country     Total  Q    V      Overlap  Total  Q    V      Overlap  Total  Q    V
Canada      574    38   31     99.9%    565    37   31     99.5%    569    37   31
U.S.        547    34   31     100.0%   534    33   30     100.0%   537    33   31
China       581    47   23     100.0%   593    47   24     100.0%   589    47   23
India       577    42   27     99.9%    580    42   27     100.0%   582    43   27
S. Korea    584    46   24     99.5%    577    45   24     99.5%    585    46   24
Germany     579    39   30     98.6%    570    38   30     99.4%    575    39   30
France      566    39   29     98.9%    561    39   28     99.3%    556    38   28
U.K.        599    38   34     98.2%    592    38   34     98.6%    590    37   34


Figure B.2: Projections for recalculated average TY2016 GMAT scores⁵ resulting from a minimum segment size of 10


            TY2016 Actual      TY2016 Using TY2011 Data    TY2016 Using TY2012 Data
Country     Total  Q    V      Overlap  Total  Q    V      Overlap  Total  Q    V
Canada      574    38   31     94.7%    565    37   31     94.7%    569    37   31
U.S.        547    34   31     99.8%    534    33   30     99.8%    537    33   31
China       581    47   23     99.8%    593    47   24     99.8%    589    47   23
India       577    42   27     99.3%    581    42   27     99.3%    582    43   27
S. Korea    584    46   24     91.0%    579    46   24     91.0%    588    46   24
Germany     579    39   30     92.6%    571    39   30     92.6%    577    39   30
France      566    39   29     89.0%    561    39   28     89.0%    556    38   28
U.K.        599    38   34     70.5%    596    38   34     70.5%    592    38   33


⁴ Total = GMAT Total score; Q = Quantitative reasoning score; V = Verbal reasoning score.


⁵ Total = GMAT Total score; Q = Quantitative reasoning score; V = Verbal reasoning score.


Figure B.1 (continued): minimum segment size of one


            TY2016 Using TY2013 Data       TY2016 Using TY2014 Data    TY2016 Using TY2015 Data
Country     Overlap  Total  Q     V        Overlap  Total  Q           Overlap  Total  Q
Canada      99.6%    568    37    31       99.7%    566    37          99.8%    572    37
U.S.        100.0%   535    33    30       100.0%   539    33          100.0%   543    34
China       100.0%   585    47    23       100.0%   584    47          100.0%   583    47
India       99.9%    577    42    27       100.0%   576    42          100.0%   578    42
S. Korea    99.0%    580    46    23       99.3%    577    46          99.4%    582    46
Germany     98.8%    575    39    30       99.3%    579    39          98.9%    580    39
France      99.1%    561    38    29       98.9%    558    38          99.1%    565    39
U.K.        97.0%    594    38    34       98.1%    594    38          98.1%    592    38


Figure B.2 (continued): minimum segment size of 10


            TY2016 Using TY2013 Data       TY2016 Using TY2014 Data    TY2016 Using TY2015 Data
Country     Overlap  Total  Q     V        Overlap  Total  Q           Overlap  Total
Canada      94.7%    567    36.8  31       94.7%    566    36.7        94.7%    572
U.S.        99.8%    535    32.9  30       99.8%    539    33.2        99.8%    543
China       99.8%    585    47    23       99.8%    584    47          99.8%    583
India       99.3%    577    42    27       99.3%    576    42          99.3%    579
S. Korea    91.0%    582    46    23       91.0%    579    46          91.0%    582
Germany     92.8%    577    39    30       92.6%    580    39          92.8%    580
France      89.0%    561    39    28       89.0%    559    39          89.0%    565
U.K.        70.0%    601    38    34       70.5%    597                70.5%    596